    Haplotype-aware Diplotyping from Noisy Long Reads

    MBG: Minimizer-based sparse de Bruijn Graph construction

    Read-based Phasing of Related Individuals

    Motivation: Read-based phasing deduces the haplotypes of an individual from sequencing reads that cover multiple variants, while genetic phasing takes only genotypes as input and applies the rules of Mendelian inheritance to infer haplotypes within a pedigree of individuals. Combining both into an approach that uses these two independent sources of information—reads and pedigree—has the potential to deliver results better than each individually. Results: We provide a theoretical framework combining read-based phasing with genetic haplotyping, and describe a fixed-parameter algorithm and its implementation for finding an optimal solution. We show that leveraging reads of related individuals jointly in this way yields more phased variants and at a higher accuracy than when phased separately, both in simulated and real data. Coverages as low as 2× for each member of a trio yield haplotypes that are as accurate as when analyzed separately at 15× coverage per individual. Availability and Implementation: https://bitbucket.org/whatshap/whatshap Contact: [email protected]

    An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms

    We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer-Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm's running time cost (such as the number of text character accesses) for any given pattern in a random text model. Text models range from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). Furthermore, we provide an algorithm to compute the exact distribution of differences in running time cost of two pattern matching algorithms. Methodologically, we use extensions of finite automata which we call deterministic arithmetic automata (DAAs) and probabilistic arithmetic automata (PAAs) [Marschall2008]. Given an algorithm, a pattern, and a text model, a PAA is constructed from which the sought distributions can be derived using dynamic programming. To our knowledge, this is the first time that substring- or suffix-based pattern matching algorithms are analyzed exactly by computing the whole distribution of running time cost. Experimentally, we compare Horspool's algorithm, Backward DAWG Matching, and Backward Oracle Matching on prototypical patterns of short length and provide statistics on the size of minimal DAAs for these computations.
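
    To make the cost measure in this abstract concrete, the following minimal sketch counts text character accesses for Horspool's algorithm and estimates the cost distribution by sampling uniform random texts. It is not the exact DAA/PAA method the abstract describes: sampling is swapped in purely for illustration, and the names `horspool_count` and `empirical_cost_distribution` are hypothetical.

```python
import random
from collections import Counter


def horspool_count(pattern, text):
    """Run Horspool's algorithm; return (match positions, text character accesses)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0
    # Shift table: distance from the last occurrence of each character
    # in pattern[:-1] to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    matches, accesses, pos = [], 0, 0
    while pos <= n - m:
        j = m - 1
        # Compare the window right to left, counting each text access once.
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        else:
            matches.append(pos)
        # The shift depends on the last window character, which was
        # already read by the first comparison above (no extra access).
        pos += shift.get(text[pos + m - 1], m)
    return matches, accesses


def empirical_cost_distribution(pattern, alphabet, n, trials, seed=0):
    """Monte Carlo estimate (assumption: i.i.d. uniform text model)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        text = "".join(rng.choices(alphabet, k=n))
        counts[horspool_count(pattern, text)[1]] += 1
    return {cost: c / trials for cost, c in sorted(counts.items())}
```

    For example, `horspool_count("abc", "abcabc")` reports matches at positions 0 and 3 using 6 character accesses, while `horspool_count("aaa", "bbbbbb")` needs only 2 accesses because each mismatch shifts the window by the full pattern length.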

    Developing Recommendations for a Management Plan in La Playuela Beach, Puerto Rico

    This project, sponsored by the Department of Natural and Environmental Resources (DNER), proposes elements of a new management plan for La Playuela in Cabo Rojo, Puerto Rico. The project team analyzed the current state of the environment as well as social impacts on La Playuela. A proposal outlining steps to decrease environmental impact was presented to the DNER on December 14th, 2015. The team suggested a vehicle limit of 136 cars and additional management strategies for La Playuela.

    Repeat- and Error-Aware Comparison of Deletions

    Motivation: The number of reported genetic variants is rapidly growing, empowered by ever faster accumulation of next-generation sequencing data. A major issue is comparability. Standards that address the combined problem of inaccurately predicted breakpoints and repeat-induced ambiguities are missing. This decisively lowers the quality of ‘consensus’ callsets and hampers the removal of duplicate entries in variant databases, which can have deleterious effects in downstream analyses. Results: We introduce a sound framework for comparison of deletions that captures both tool-induced inaccuracies and repeat-induced ambiguities. We present a maximum matching algorithm that outputs virtual duplicates among two sets of predictions/annotations. We demonstrate that our approach is clearly superior over ad hoc criteria, like overlap, and that it can reduce the redundancy among callsets substantially. We also identify large amounts of duplicate entries in the Database of Genomic Variants, which points out the immediate relevance of our approach. Availability and implementation: Implementation is open source and available from https://bitbucket.org/readdi/readd
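
    The idea of pairing up duplicate deletions between two callsets via maximum matching can be sketched as follows. This is a minimal illustration only: the `compatible` criterion (both breakpoints within a fixed tolerance) is a hypothetical stand-in and does not model the repeat-aware criterion the abstract refers to, and the matching uses a plain augmenting-path search rather than whatever the authors' implementation does.

```python
def compatible(a, b, tol=10):
    """Hypothetical criterion: both breakpoints differ by at most tol bases."""
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol


def maximum_matching(set_a, set_b, tol=10):
    """Maximum bipartite matching between two lists of (start, end) deletions.

    Returns sorted (index_in_a, index_in_b) pairs, using augmenting paths.
    """
    match_b = [-1] * len(set_b)  # match_b[j] = index in set_a matched to j

    def try_augment(i, seen):
        for j, b in enumerate(set_b):
            if j not in seen and compatible(set_a[i], b, tol):
                seen.add(j)
                # Take j if it is free, or if its current partner can move.
                if match_b[j] == -1 or try_augment(match_b[j], seen):
                    match_b[j] = i
                    return True
        return False

    for i in range(len(set_a)):
        try_augment(i, set())
    return sorted((i, j) for j, i in enumerate(match_b) if i != -1)


calls_a = [(100, 200), (105, 205)]
calls_b = [(103, 203), (500, 600), (95, 195)]
pairs = maximum_matching(calls_a, calls_b, tol=5)
```

    Here `pairs` is `[(0, 2), (1, 0)]`: a greedy pass would pair the first deletion in `calls_a` with `(103, 203)` and leave the second unmatched, whereas the augmenting step reassigns the first to `(95, 195)` so that both find a partner.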

    Fully-sensitive Seed Finding in Sequence Graphs Using a Hybrid Index

    Genetic targeting of B-Raf(V600E) affects survival and proliferation and identifies selective agents against BRAF-mutant colorectal cancer cells

    Background: Colorectal cancers carrying the B-Raf V600E mutation are associated with a poor prognosis. The purpose of this study was to identify B-Raf(V600E)-mediated traits of cancer cells in a genetic in vitro model and to assess the selective sensitization of B-Raf(V600E)-mutant cancer cells towards therapeutic agents. Methods: Somatic cell gene targeting was used to generate subclones of the colorectal cancer cell line RKO containing either wild-type or V600E-mutant B-Raf kinase. Cell-biologic analyses were performed in order to link cancer cell traits to the BRAF-mutant genotype. Subsequently, the corresponding tumor cell clones were characterized pharmacogenetically to identify therapeutic agents exhibiting selective sensitivity in B-Raf(V600E)-mutant cells. Results: Genetic targeting of mutant BRAF resulted in restoration of sensitivity to serum starvation-induced apoptosis and efficiently inhibited cell proliferation in the absence of growth factors. Among tested agents, the B-Raf inhibitor dabrafenib was found to induce a strong V600E-dependent shift in cell viability. In contrast, no differential sensitizing effect was observed for conventional chemotherapeutic agents (mitomycin C, oxaliplatin, paclitaxel, etoposide, 5-fluorouracil), for the targeted agents cetuximab, sorafenib, vemurafenib, and RAF265, or for inhibition of PI3 kinase. Treatment with dabrafenib efficiently inhibited phosphorylation of the B-Raf downstream targets Mek 1/2 and Erk 1/2. Conclusion: Mutant BRAF alleles mediate self-sufficiency of growth signals and serum starvation-induced resistance to apoptosis. Targeting of the BRAF mutation leads to a loss of these hallmarks of cancer. Dabrafenib selectively inhibits cell viability in B-Raf(V600E)-mutant cancer cells.